ECE 901 Lecture 7: PAC bounds and Concentration of Measure
Abstract
2 Agnostic Learning

We will proceed without making any assumptions on the distribution $P_{XY}$. This situation is often termed Agnostic Learning. The root of the word agnostic literally means "not known." The term agnostic learning emphasizes the fact that often, perhaps usually, we have no prior knowledge about $P_{XY}$ and $f^*$. The question then arises of how we can reasonably select an $f \in \mathcal{F}$ in this setting.
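A natural selection rule in this agnostic setting is empirical risk minimization: choose the $f \in \mathcal{F}$ with the smallest average loss on the training data. Below is a minimal sketch under assumed ingredients (a finite class of threshold classifiers, 0/1 loss, and synthetic noisy labels); these specifics are illustrative and not taken from the lecture itself.

```python
import numpy as np

def empirical_risk(f, X, y):
    """Average 0/1 loss of classifier f on the sample (X, y)."""
    return np.mean(f(X) != y)

def erm(F, X, y):
    """Return the element of the finite class F with smallest empirical risk."""
    return min(F, key=lambda f: empirical_risk(f, X, y))

# Illustrative finite class: threshold classifiers f_t(x) = 1{x >= t}.
thresholds = np.linspace(0.0, 1.0, 21)
F = [lambda x, t=t: (x >= t).astype(int) for t in thresholds]

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 200)
y = (X >= 0.5).astype(int)
flip = rng.random(200) < 0.1        # unknown label noise: no model of P_XY is assumed
y = np.where(flip, 1 - y, y)

f_hat = erm(F, X, y)
print("empirical risk of the ERM choice:", empirical_risk(f_hat, X, y))
```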
Similar resources
ELEN6887 Lecture 7: PAC bounds and Concentration of Measure
ECE 901 Lecture 15: Denoising Smooth Functions with Unknown Smoothness
Lipschitz functions are interesting, but they can be very rough (they can have many kinks). In many situations the underlying function is much smoother; this is how one might model, for example, the temperature inside a museum room. Often we do not know how smooth the function is, so an interesting question is whether we can adapt to the unknown smoothness. In this lecture we will use the Maximum Complexit...
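As a rough illustration of adapting to unknown smoothness, one can fit piecewise-constant estimates at several resolutions and select the resolution by penalized empirical risk. The sketch below assumes a generic $c\,m \log n / n$ penalty and synthetic data as placeholders; it is not the lecture's actual complexity-regularization penalty.

```python
import numpy as np

def piecewise_constant_fit(y, m):
    """Average the samples within each of m equal-width bins."""
    n = len(y)
    edges = np.linspace(0, n, m + 1).astype(int)
    fit = np.empty(n)
    for j in range(m):
        fit[edges[j]:edges[j + 1]] = y[edges[j]:edges[j + 1]].mean()
    return fit

def select_resolution(y, ms, c=1.0):
    """Pick m minimizing empirical risk + penalty; the c*m*log(n)/n penalty is
    an assumed stand-in for a complexity-regularization penalty."""
    n = len(y)
    def score(m):
        fit = piecewise_constant_fit(y, m)
        return np.mean((y - fit) ** 2) + c * m * np.log(n) / n
    return min(ms, key=score)

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 512)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(512)  # smooth signal + noise
m_hat = select_resolution(y, ms=[1, 2, 4, 8, 16, 32, 64, 128])
print("selected number of bins:", m_hat)
```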
ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
$Y_i \overset{\text{i.i.d.}}{\sim} p_{\theta^*}$, $i \in \{1, \ldots, n\}$, where $\theta^* \in \Theta$. We can view $p_{\theta^*}$ as a member of a parametric class of distributions, $\mathcal{P} = \{p_\theta\}_{\theta \in \Theta}$. Our goal is to use the observations $\{Y_i\}$ to select an appropriate distribution (e.g., model) from $\mathcal{P}$. We would like the selected distribution to be close to $p_{\theta^*}$ in some sense. We use the negative log-likelihood loss function, defined as $\ell(\theta, Y_i) = -\log p_\theta(Y_i)$. The e...
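To make the setup concrete, the sketch below minimizes the empirical negative log-likelihood over a grid of candidate parameters for an assumed Gaussian-mean class $\mathcal{P} = \{N(\theta, 1)\}_{\theta \in \Theta}$; the class and the grid are illustrative choices, not taken from the lecture.

```python
import numpy as np

def nll(theta, y):
    """Empirical negative log-likelihood for p_theta = N(theta, 1),
    up to the constant 0.5*log(2*pi), which does not affect the minimizer."""
    return np.mean(0.5 * (y - theta) ** 2)

rng = np.random.default_rng(2)
theta_star = 1.3
y = theta_star + rng.standard_normal(100)   # Y_i i.i.d. ~ p_{theta*}

grid = np.linspace(-3, 3, 601)              # candidate parameter set Theta
theta_hat = grid[np.argmin([nll(t, y) for t in grid])]
print("theta_hat:", theta_hat)              # close to the sample mean, the exact MLE here
```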
ECE 901 Lecture 4: Estimation of Lipschitz smooth functions
Consider the following setting. Let $Y = f^*(X) + W$, where $X$ is a random variable (r.v.) on $\mathcal{X} = [0,1]$, $W$ is a r.v. on $\mathcal{Y} = \mathbb{R}$, independent of $X$ and satisfying $\mathbb{E}[W] = 0$ and $\mathbb{E}[W^2] = \sigma^2 < \infty$. Finally, let $f^* : [0,1] \to \mathbb{R}$ be a function satisfying
$$|f^*(t) - f^*(s)| \leq L|t - s|, \quad \forall\, t, s \in [0,1], \qquad (1)$$
where $L > 0$ is a constant. A function satisfying condition (1) is said to be Lipschitz on $[0,1]$. Notice that such a...
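For intuition about condition (1), the snippet below numerically estimates the smallest $L$ that works for a given $f$ over a fine grid; the example function and grid size are illustrative assumptions.

```python
import numpy as np

def empirical_lipschitz_constant(f, grid):
    """Largest slope |f(t) - f(s)| / |t - s| over consecutive grid points;
    this is a lower bound on the true Lipschitz constant of f on [0, 1]."""
    vals = f(grid)
    return np.max(np.abs(np.diff(vals)) / np.diff(grid))

grid = np.linspace(0, 1, 10_001)
f = lambda t: np.sin(4 * t)                   # |f'| <= 4, so f is Lipschitz with L = 4
print(empirical_lipschitz_constant(f, grid))  # approximately 4.0
```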
PAC-Bayesian Generalization Bound on Confusion Matrix for Multi-Class Classification
In this work, we propose a PAC-Bayes bound on the generalization risk of the Gibbs classifier in the multi-class classification framework. The novelty of our work is the critical use of the confusion matrix of a classifier as an error measure; this puts our contribution in the line of work aiming to deal with performance measures that are richer than a mere scalar criterion such as the misclas...
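To illustrate the confusion matrix as an error measure, the toy sketch below averages the confusion matrices of classifiers drawn from an assumed two-element posterior (a stand-in for the Gibbs classifier) and summarizes the off-diagonal part by its operator norm; all specifics here are illustrative, not the paper's construction.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, k):
    """C[i, j] = fraction of class-i examples that were predicted as class j."""
    C = np.zeros((k, k))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    rows = C.sum(axis=1, keepdims=True)
    return C / np.maximum(rows, 1)

rng = np.random.default_rng(3)
k, n = 3, 3000
y = rng.integers(0, k, n)

def noisy_classifier(acc):
    """Toy classifier: keeps the true label with probability `acc`,
    otherwise predicts a uniformly random class."""
    def h(y_true):
        wrong = rng.integers(0, k, len(y_true))
        keep = rng.random(len(y_true)) < acc
        return np.where(keep, y_true, wrong)
    return h

# Gibbs classifier over an assumed two-element posterior (weights are toy choices).
posterior = [(0.7, noisy_classifier(0.8)), (0.3, noisy_classifier(0.6))]
C_gibbs = sum(w * confusion_matrix(y, h(y), k) for w, h in posterior)

# Operator (spectral) norm of the off-diagonal part as a scalar risk summary.
E = C_gibbs - np.diag(np.diag(C_gibbs))
print("operator norm of the error part:", np.linalg.norm(E, 2))
```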